Combining Models for the Alignment of Parallel Syntactic Trees
Abstract
The alignment of syntactic trees is the task of aligning the internal and leaf nodes of two trees that represent sentences in different languages. The output of the alignment can be used, for instance, as a knowledge resource for learning translation rules (for rule-based machine translation systems) or models (for statistical machine translation systems). This paper presents experiments based on two syntactic tree alignment algorithms, presented in [Lavie et al. 2008] and [Tinsley et al. 2007]. To improve the performance of internal-node alignment, several approaches for combining the output of these two algorithms were evaluated on Brazilian Portuguese and English parallel trees.
Similar resources
Extraction of Syntactic Translation Models from Parallel Data using Syntax from Source and Target Languages
We propose a generic rule induction framework that is informed by syntax from both sides of a parsed parallel corpus, as sets of structural, boundary and labeling related constraints. Factoring syntax in this manner empowers our framework to work with independent annotations coming from multiple resources and not necessarily a single syntactic structure. We then explore the issue of lexical cov...
Discriminative Word Alignment with Syntactic Features
This report introduces a study on syntactic features used in a discriminative word alignment model. The features are implemented on a state-of-the-art discriminative word alignment system. The syntactic features are extracted from parse trees. Three types of syntactic features are explored in this work: one global tree path feature and two first order tree features. Experimental results sho...
Language engineering for syntactic knowledge transfer
In this paper we present a method for constructing an English-Romanian treebank, together with the evaluation results obtained. The treebank is built upon a parallel English-Romanian corpus that is word-aligned and annotated at the morphological and syntactic level. The syntactic trees of the Romanian texts are generated by considering the syntactic phrases of the English parallel texts automatically r...
Improving Syntax Driven Translation Models by Re-structuring Divergent and Non-isomorphic Parse Tree Structures
Syntax-based approaches to statistical MT require syntax-aware methods for acquiring their underlying translation models from parallel data. This acquisition process can be driven by syntactic trees for either the source or target language, or by trees on both sides. Work to date has demonstrated that using trees for both sides suffers from severe coverage problems. This is primarily due to the...
Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair
Discriminative approaches for word alignment have gained popularity in recent years because of the flexibility that they offer for using a large variety of features and combining information from various sources. However, the models proposed in the past have not been able to make much use of features that capture the likelihood of an alignment structure (the set of alignment links) and the syntacti...
Publication date: 2011